Ex_treme's blog.

Improving the user-based collaborative filtering algorithm (usercf improvements)

2018/11/22

Debugging workflow

def main_flow():
    """
    Main flow of usercf.
    :return:
    """
    # reader is the project's own data-loading module.
    user_click, user_click_time = reader.get_user_click(
        "/home/pzs741/PycharmProjects/CollaborativeFiltering/data/ratings.csv")
    item_info = reader.get_item_info(
        "/home/pzs741/PycharmProjects/CollaborativeFiltering/data/movies.csv")
    item_click_by_user = transfer_user_click(user_click)
    user_sim = cal_user_sim(item_click_by_user, user_click_time)
    debug_user_sim(user_sim)
    # Uncomment to compute and print the recommendations as well.
    # recom_result = cal_recom_result(user_click, user_sim)
    # debug_recom_result(item_info, recom_result)
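
The helper `transfer_user_click` is not shown in this post. A minimal sketch, assuming `user_click` maps each user id to the list of item ids that user clicked and that the helper simply inverts that mapping. The name `invert_user_click` and the sample data are hypothetical:

```python
def invert_user_click(user_click):
    """Invert {userid: [itemid, ...]} into {itemid: [userid, ...]}."""
    item_click_by_user = {}
    for userid, item_list in user_click.items():
        for itemid in item_list:
            # Create the user list on first sight of the item, then append.
            item_click_by_user.setdefault(itemid, []).append(userid)
    return item_click_by_user


# Hypothetical sample data, not taken from ratings.csv.
sample_user_click = {"1": ["10", "20"], "2": ["20", "30"]}
sample_item_click_by_user = invert_user_click(sample_user_click)
print(sample_item_click_by_user)
```

Keying by item makes it cheap to find every pair of users who clicked the same item, which is what a user similarity pass needs.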

Debugging code

  • Debug code for user similarity

def debug_user_sim(user_sim):
    """
    Print the user similarity result.
    :param user_sim: key userid, value [(userid1, score1), (userid2, score2)]
    :return:
    """
    topk = 5
    fix_user = "1"
    if fix_user not in user_sim:
        print("invalid user")
        return
    # Each zuhe is a (userid, score) pair; show only the topk most similar users.
    for zuhe in user_sim[fix_user][:topk]:
        userid, score = zuhe
        print(fix_user + "\tsim_user" + userid + "\t" + str(score))

  • Debug code for the recommendation results
def debug_recom_result(item_info, recom_result):
    """
    Print the recommendation result for one user.
    :param item_info: key itemid, value [title, genres]
    :param recom_result: key userid, value dict (key itemid, value recom_score)
    :return:
    """
    fix_user = "1"
    if fix_user not in recom_result:
        print("invalid user for recommendation result")
        return
    for itemid in recom_result[fix_user]:
        # Skip items that have no entry in movies.csv.
        if itemid not in item_info:
            continue
        recom_score = recom_result[fix_user][itemid]
        print("recom_result:" + ",".join(item_info[itemid]) + "\t" + str(recom_score))
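
`cal_recom_result` is not shown in this post. In the results at the end of the post, several recommended items share exactly the similarity score of one neighbor, which suggests that each item simply inherits the score of the most similar neighbor who clicked it. The sketch below is one reading consistent with that output, not the author's implementation. The function name, the `topk_user` and `item_num` defaults, and the sample data are hypothetical, and it does not filter out items the user has already clicked:

```python
def sketch_recom_result(user_click, user_sim, topk_user=3, item_num=5):
    """Hypothetical: build {userid: {itemid: score}} from each user's nearest neighbors.

    user_click: {userid: [itemid, ...]}
    user_sim:   {userid: [(neighbor_id, score), ...]}, sorted by score, descending
    """
    recom_result = {}
    for userid in user_click:
        recom_result[userid] = {}
        for neighbor, score in user_sim.get(userid, [])[:topk_user]:
            for itemid in user_click.get(neighbor, [])[:item_num]:
                # setdefault keeps the score of the first (most similar) neighbor.
                recom_result[userid].setdefault(itemid, score)
    return recom_result


# Hypothetical sample data.
sample_user_click = {"1": ["a"], "2": ["b", "c"], "3": ["c", "d"]}
sample_user_sim = {"1": [("2", 0.9), ("3", 0.4)]}
sample_recom = sketch_recom_result(sample_user_click, sample_user_sim)
print(sample_recom["1"])
```

Item "c" was clicked by both neighbors but keeps the score of user "2", the more similar one. That is why a neighbor further down the list can contribute fewer new items than `item_num`.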

Algorithm improvement 1

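Improvement 1 lowers the contribution of popular items: the more users have clicked an item, the less a shared click on it adds to the similarity of two users.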

import math


def update_contribution_score(item_user_click_count):
    """
    usercf user contribution score update v1
    :param item_user_click_count: number of users who have clicked this item
    :return: contribution score
    """
    return 1 / math.log10(1 + item_user_click_count)
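
To see the effect of this weight, the sketch below evaluates the same expression for items of different popularity, then uses it while accumulating pairwise user similarity. `cal_user_sim` itself is not shown in this post, so the accumulation loop, the normalization by the two users' click counts, and every name other than `update_contribution_score` are assumptions:

```python
import math


def update_contribution_score(item_user_click_count):
    # Same expression as in the post: popular items contribute less.
    return 1 / math.log10(1 + item_user_click_count)


def sketch_user_sim(item_click_by_user):
    """Hypothetical pass: {itemid: [userid, ...]} -> {userid: {userid: score}}."""
    co_appear = {}
    user_click_count = {}
    for itemid, user_list in item_click_by_user.items():
        weight = update_contribution_score(len(user_list))
        for user_i in user_list:
            user_click_count[user_i] = user_click_count.get(user_i, 0) + 1
            for user_j in user_list:
                if user_i == user_j:
                    continue
                pair_scores = co_appear.setdefault(user_i, {})
                pair_scores[user_j] = pair_scores.get(user_j, 0) + weight
    # Normalize by the geometric mean of the two users' click counts.
    user_sim = {}
    for user_i, related in co_appear.items():
        for user_j, score in related.items():
            norm = math.sqrt(user_click_count[user_i] * user_click_count[user_j])
            user_sim.setdefault(user_i, {})[user_j] = score / norm
    return user_sim


for count in (9, 99, 999):
    print(count, update_contribution_score(count))

# Hypothetical data: users "1" and "2" share two items, user "3" only the popular one.
sim = sketch_user_sim({"a": ["1", "2"], "b": ["1", "2", "3"]})
print(sim["1"])
```

The weight falls as the click count grows, so a niche item shared by two users counts for more than a blockbuster. A click count of 0 would raise `ZeroDivisionError`, but an item only appears in `item_click_by_user` once at least one user has clicked it.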

Algorithm improvement 2

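Improvement 2 takes click time into account: the further apart in time two users clicked the same item, the less that shared click adds to their similarity.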

def update_two_contribution_score(click_time_one, click_time_two):
    """
    usercf user contribution score update v2
    :param click_time_one: time at which the first user clicked the item
    :param click_time_two: time at which the second user clicked the item
    :return: contribution score
    """
    delta_time = abs(click_time_two - click_time_one)
    total_sec = 60 * 60 * 24  # seconds in one day
    delta_time = delta_time / total_sec  # gap between the two clicks, in days
    return 1 / (1 + delta_time)
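
As a quick check of the decay, the sketch below evaluates the same expression for a few gaps between two clicks. It assumes the click times are timestamps in seconds, which the division by 60 * 60 * 24 suggests. The base timestamp is made up:

```python
SECONDS_PER_DAY = 60 * 60 * 24


def update_two_contribution_score(click_time_one, click_time_two):
    # Same expression as in the post: convert the gap to days, then decay.
    delta_days = abs(click_time_two - click_time_one) / SECONDS_PER_DAY
    return 1 / (1 + delta_days)


base = 1_000_000_000  # made-up timestamp, in seconds
for gap_days in (0, 1, 9, 99):
    score = update_two_contribution_score(base, base + gap_days * SECONDS_PER_DAY)
    print(gap_days, score)
```

Two clicks at the same moment contribute 1.0 and clicks one day apart contribute 0.5. The score keeps shrinking toward 0 as the gap grows, and because of `abs` the order of the two arguments does not matter.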

Debugging results

  • Results for improvement 1

User similarity output
    1 sim_user313 0.18803307299446
    1 sim_user57 0.1879004753996153
    1 sim_user288 0.18756012667350208
    1 sim_user266 0.18399642220363496
    1 sim_user368 0.17954277437174224

Recommendation output
    recom_result:Heat (1995),Action|Crime|Thriller 0.18803307299446
    recom_result:GoldenEye (1995),Action|Adventure|Thriller 0.18803307299446
    recom_result:"City of Lost Children, The (Cité des enfants perdus, La) (1995)",Adventure|Drama|Fantasy|Mystery|Sci-Fi 0.18803307299446
    recom_result:Twelve Monkeys (a.k.a. 12 Monkeys) (1995),Mystery|Sci-Fi|Thriller 0.18803307299446
    recom_result:Clueless (1995),Comedy|Romance 0.18803307299446
    recom_result:Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy 0.1879004753996153
    recom_result:"American President, The (1995)",Comedy|Drama|Romance 0.1879004753996153
    recom_result:Get Shorty (1995),Comedy|Crime|Thriller 0.1879004753996153
    recom_result:Grumpier Old Men (1995),Comedy|Romance 0.18756012667350208
    recom_result:Sense and Sensibility (1995),Drama|Romance 0.18756012667350208

  • Results for improvement 2

User similarity output
    1 sim_user96 0.15791081894849407
    1 sim_user469 0.1172771823445635
    1 sim_user27 0.08282266726133267
    1 sim_user265 0.07282310151619956
    1 sim_user19 0.06379553900727523

Recommendation output
    recom_result:Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy 0.15791081894849407
    recom_result:Babe (1995),Children|Drama 0.15791081894849407
    recom_result:"Usual Suspects, The (1995)",Crime|Mystery|Thriller 0.15791081894849407
    recom_result:Braveheart (1995),Action|Drama|War 0.15791081894849407
    recom_result:Apollo 13 (1995),Adventure|Drama|IMAX 0.15791081894849407
    recom_result:Heat (1995),Action|Crime|Thriller 0.1172771823445635
    recom_result:"American President, The (1995)",Comedy|Drama|Romance 0.1172771823445635
    recom_result:"City of Lost Children, The (Cité des enfants perdus, La) (1995)",Adventure|Drama|Fantasy|Mystery|Sci-Fi 0.1172771823445635
    recom_result:Twelve Monkeys (a.k.a. 12 Monkeys) (1995),Mystery|Sci-Fi|Thriller 0.1172771823445635
    recom_result:Jumanji (1995),Adventure|Children|Fantasy 0.08282266726133267
    recom_result:Pocahontas (1995),Animation|Children|Drama|Musical|Romance 0.08282266726133267
    recom_result:"Indian in the Cupboard, The (1995)",Adventure|Children|Fantasy 0.08282266726133267
